perm filename SYS1[NUM,DBL] blob sn#143220 filedate 1975-02-12 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00011 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.DEVICE XGP
C00003 00003	.GROUP SKIP 5
C00004 00004	2CONTENTS*
C00005 00005	21. MOTIVATING ISSUES*
C00016 00006	22. TASK: GLOBAL LEVEL*
C00019 00007	23. TASK: ACTIONS  LEVEL*
C00033 00008	24. IDEAS: INTERNAL*
C00048 00009	25. PROTOCOL*
C00062 00010	26. TASK: INTERNAL LEVEL*
C00067 00011	27. TENTATIVE TIMETABLE*
C00078 ENDMK
C⊗;
.DEVICE XGP
.FONT 1 "NGR25"
.FONT 2 "SIGN57"
.FONT 3 "SHD40"
.FONT 4 "BDI25"
.FONT 5 "NGB30"
.FONT 6 "NGR20"
.FONT 7 "SGN114"
.TURN ON "↑↓_π{"
.TURN ON "⊗" FOR "%"
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO E ⊂ APART END ⊃
.TABBREAK
.PAGE FRAME 54 HIGH 75 WIDE
.PAGE FRAME 54 HIGH 77 WIDE
.GROUP SKIP 5
.BEGIN CENTER RETAIN
⊗7THEORY
FORMATION:⊗*

⊗6Some Stray Initial Thoughts on⊗*

⊗2A  SYSTEM  WHICH  CAN
DEVELOP  MATHEMATICAL
CONCEPTS  INTUITIVELY⊗*
.END
.GROUP SKIP 10
.NOFILL
⊗3DOUG LENAT

AVRA COHN


STANFORD UNIVERSITY
ARTIFICIAL INTELLIGENCE LABORATORY

⊗*


FIRST  SKETCH

⊗4Not for distribution⊗*




			⊗6Last modified 1/16/75⊗*
.NEXT PAGE
.ONCE CENTER
⊗2CONTENTS⊗*

.GROUP SKIP 10

.NOFILL
1. Overall motivating ideas
2. Task: Level 1: Global Considerations
3. Task: Level 2: How to Learn Arithmetic (once you can count)
4. Ideas for the internal structure of the system
5. Sample Protocol Sessions
6. Task: Level 3: The actual  internal structures  interacting
.FILL
.EVERY HEADING(⊗3MATH THEORY FORMATION⊗*,,⊗4Doug Lenat and Avra Cohn⊗*)
.EVERY FOOTING(⊗61/16/75,,page {PAGE})
.NEXT PAGE
⊗21. MOTIVATING ISSUES⊗*

It's easy to convince oneself of the primacy of intuition (over formal
methods) in mathematical activity - in math considered as a type of
intelligent behavior rather than as a finished product.  Often, formal-
ization, proof and axiomatization are imposed after the fact of discovery
and creation of mathematics, and the formalisms are apparently unrelated
to the essential intuitions.  They give no hint of the thought processes
that went into finding the proofs or choosing the axioms - guessing,
inferring, considering seemingly unrelated fragments, extracting relevant
features, noting analogies, conjecturing, planning for solutions, judging
according to experience, working backwards, changing approaches, restating
goals, etc.  

More to the point yet not very helpful, are descriptions of what mathematical
intuition is like (feels like):

   The problem is laid aside, so far as the conscious mind is
   concerned...Apparently during this period unconscious mental
   activity concerned with the problem continues; for suddenly
   an insight relating to the problem - perhaps a conplete
   solution - erupts into consciousness...           - Skemp

   You cannot compel helpful ideas to appear.  I take my problem
   earnestly.  I put it to myself.  I set it to myself.  I realize
   it keenly.  I become absorbed in my problem.  I am waiting
   for a helpful idea; will it come?  Perhaps at once, perhaps
   after some time, perhaps not at all...They [helpful ideas] are
   our masters and they are capricious and self-willed.  They may
   flash upon us unexpectedly...                    - Polya

Such descriptions are valid, of course; most people can confirm having had
like experiences.  But they mislead by giving the impression that there
isn't or cannot be an explanation or mechanism for the intuitive process.
They give no hint of how a math learning and doing system could be implemented
on computers - perhaps worse, they give no hint as to how people learn and
do math.

More suggestive and helpful are descriptions of the intuitive process that
allude to the importance of the organization of knowledge.  What is needed
for ideas to appear and occur, or for new information to be understood is
not only that the requisite facts and techniques are present, but that they
are accessible and available when they are appropriate.  Surprisingly,
there seems to be little discussion of the form and nature of mental
organization in the literature on math learning and teaching (where one
might expect to find it).

   We establish contact with an extensive layer of our formerly acquired
   knowledge, some streak of which might be useful now...   - Polya

   Of course, the more extensive your knowledge is and the better
   it is organized the more chance you have to find what you need.

   It [generalization] makes possible conscious, controlled and accurate
   accomodation of one's existing schemas, not only in response to
   the demands for assimilation of new situations as they are encountered,
   but ahead of these demands, seeking or creating new examples to fit
   the enlarged concept.                                    - Skemp

   Once a person is able to analyze new material for himself, he can
   fit it onto his own schemas in the ways most meaningful to himself
  which may or may not be the same ways as it was presented.


The conception of the project is to build a system that learns and does
mathematics by creating and maintaining and using "good" internal
organizations of it knowledge.  The self-organizing feature implies
automaticity, and hence the usefulness of automatic programmming.  
Information, ideas and concepts should be stored with an eye for
future use and connections with the existing structures, and accessed in
an intelligent way.


.GROUP SKIP 5
Elementary number theory as a domain has the (possible) advantages of
being non-visual to a large extent, fairly basic and, in parts,
hierarchically organized.

⊗5An analogy:⊗*
Artificial Intelligence is unable to use most psychological studies
on learning, because the latter deal typically with rote 
memorization, not with creative assimilation. The reason for this
is that the simple type of learning is more accessable to quantitative
investigation.  The same unfortunate dodge may be occurring in
automatic theorem proving. The difficult problems of creative invention
in mathematics have been avoided, and work has been on the mechanical
methods of deduction.  The tradeoff of discovery for completeness is
never made by successful research meathematicians.

Why choose mathematics as a domain?  The effort expended already on
concept-formation program writers has shown that even a seemingly
clean task domain presents massive dialogue problems. In math, much
of this goes away; communication is tolerated by both man and by
machine in a formal language (mathematical notation.)

.NEXT PAGE
⊗22. TASK: GLOBAL LEVEL⊗*

⊗5FACETS: Design considerations⊗*
.NOFILL

(1) The given, initally supplied facts and methods.
	The format
	The content

(2) The teacher's inputs
	Format
	Content

(3) What is "learned" 
	Format of internal new knowledge
	Content

(4) Abilities of the system to demonstrate its understanding
	Tests
	How easily it learns new information

(5) Methods of understanding: 
	assimilating teacher's inputs into internal formats
	meaningful interaction between relevant knowledge
.FILL

⊗5The given knowledge and abilities⊗*

The system will be provided with background information on set theory,
the meaning of definition, truth, falsity, proof, deduction, induction,
inference, aesthetics, and efficiency values. The internal representations
used will be devised by the system; there is room here for instruction
by the user, and (more exciting) the development of new ways of representing
mathematical concepts. For example, our exponential notation for numbers
lends itself to fast algorithmns for addition and multiplication; perhaps
there is a better representation for number-theoretic operations.

⊗5The role of the human in this system⊗*

There will be a human watching and helping the system. His roles will
include that of guide, teacher, and experimenter. The former roles will
be played when the system is off the track, or boged down; the latter
when the system is truly advanced.
There should be communication in a formal yet natural meta-mathematical
language, and also some non-linguistic modes of communication.

.NEXT PAGE
⊗23. TASK: ACTIONS  LEVEL⊗*

⊗5An example: a plausible chain of specific discoveries⊗*


Example of interacting modules of knowledge, discovering addition,
multiplication,exponentiation, primes, the unique factorization theorem,
greatest common divisor. Bear in mind that this is just one early idea
of one possible development, based on a tentative set of given concepts.
	The knowledge base is assumed to contain notions of set theory
(including joining by both UNION and by APPEND),
truth and falsity, proof, relations, and operators.  Specific
acquaintence is assumed with the relations of ordering and equality,
the concept of counting, and the operator COUNT  (which takes a set
and returns the number of elements in it). Despite this ability,
there is not necessarily any known way of representing numbers.
There are
assumed various heuristics known to the program: ways to prove things,
regularities to look for, ways of deciding how interesting an activity
is, how to know when to make a new definition, etc.

	Many orderings of activities, and ways of interacting, must be
possible.  One typical sketch of its actions might be as follows; note
the virtual absence of communication with the user.
	1. Consider the composition COUNT * APPEND.
	2. See that it is associative, commutative.
	3. Notice that the distinguished number zero and the distinguished
set NIL are associated as identities of this composition and of
APPEND, respectively.
	4. It is very desirable to have a function operate on entities in
a given domain and return a value in that same domain. One way to do this
(with function f:DxD → D already in this form, but function g:D→E) is to
consider g(f(g↑-↑1(x),g↑-↑1(y)), where g↑-↑1:E→D is the inverse of g. This is 
relevant when g * f is interesting, as in the current case. So the function
COUNT * APPEND is considered applied to two numbers, 
by applying it actually
to two sets which have those numbers of elements.  It is seen that
COUNT * APPEND * COUNT↑-↑1xCOUNT↑-↑1 is associative and commutative, and
is (by construction) from NxN → N.
	5. Try to relate this composition to the known function SUCCESSOR.
Notice that SUCCESSOR is the COUNT of the APPEND of a set and a singleton,
so SUCCESSOR(COUNT(x)) will 
equal COUNT(APPEND(x,y)) iff y is a singleton set,
a set whose count is 1. 
So SUCCESSOR(x) ≡  COUNT(APPEND(COUNT↑-↑1(x), COUNT↑-↑1(1))).
	6. Decide it's worth defining.  Give it a name, e.g., PLUS.
So our last statement would read SUCCESSOR(x) ≡ PLUS(x,1).
This is much simpler, and hence encouraging.
	7. Since SUCCESSOR(x) can range over all nonzero numbers, so can
PLUS(x,1).  Also, notice that PLUS(0,0) is 0. So PLUS can assume any 
value in N. 
	8. A corollary of (7) is that any number can be viewed as
repeated additions of 1 to itself. In particular, view PLUS as operating
on a BAG. Then any number n is PLUS of a bag containing n 1's.
	9. PLUS is thus repeated SUCCESSOR-ing. What would it mean to
do repeated PLUS-ing?  That is, consider a function M:NxN → N, where
M(x,y) is the same as doing PLUS on a bag conaining y numbers, each of 
them being x. So M(x,y) ≡ PLUS(x,x,...<y times>,...,x). 
	10. The new function has good domain/range, is 
associative, and is commutative.
M(x,1) = x = PLUS(x,0), so the distinguished elements 1,0 are the
identities of M, PLUS, respectively. This is also a good sign.
	11. The function is worth naming, call it TIMES.  Now try to see
how TIMES can be directly interpreted in the language of sets. It seems
that TIMES(x,y) means one has a set with x elements, each of which is a
set with y elements, and one is removing the boundaries of the inner
sets, then counting the whole collection.  The commutativity is clearly
seen by arranging the elements into a matrix of x rows and y columns;
turning this sideways resluts in amatrix of y rows and x columns, but
obviously the number of objects in toto has been conserved.
	12. Once TIMES is accepted, new compositions open up for study.
Consider PLUS(x, TIMES(y,z)). Work a while, then give up;
perhaps find: PLUS(x,TIMES(x,y)) = 
TIMES(SUCCESSOR(x),y) = TIMES(PLUS(x,1),y).
	13. Consider PLUS(TIMES(x,y), TIMES(u,v)). 
Again, the only result
found will probably be when x=u. Then 
PLUS(TIMES(x,y),TIMES(x,v)) = TIMES(x, PLUS(y,v)). AHA! This was the
very next composition to be investigated, TIMES * PLUS.  This rule is
probably worth naming, call it DISTRIB.
	14. The commutativity of PLUS and of TIMES indicate how to show 
that they are not
1-1. PLUS↑-↑1(3) contains (2,1) and (1,2), so this is true. 
The associativity
indicates that it probably isn't even unique 
to within rearrangements of
the arguments, viewing PLUS as operating on a BAG. This is also true,
since PLUS↑-↑1(3) contains (1,2) and (0,3).  Similarly for TIMES↑-↑1(4) 
containing (1,4) and (2,2).
	15. Further properties may surface by considering statements
and relations involving ordering, (<, ≤). For example,
PLUS(x,y) ≥ x.  TIMES(x,y) ≥ PLUS(x,y) iff x and y are > 0.
TIMES(x,y) is usually bigger than PLUS(x,y). The
execeptions are if x or y is
0 or 1, or if x=y=2. Thus 2 is distinguished now.
	16. Higher order operations are considered. E(x,y) is defined
as TIMES operating on a bag of x elements, each of them a y. But even
at this point, associativity and commutativity fail, so there is no
reason to probe too hard. Probably the relation to be discovered is
TIMES(E(x,y), E(x,z)) = E(x, PLUS(y,z)), and perhaps also
E(E(x,y),z) = E(x, TIMES(y,z)).  
There may be some interesting relations
involving hyper-exponentiation and hyper..., but it is unlikely.
	17. Once a representation for numeration is chosen, work can be
directed toward efficient algorithms for doing PLUS and TIMES. 
Perhaps the problem should be reversed. 
We don't want to have to revert to
counting sets to actually compute 8↑4 = 4096 in unary! 
So how should a system
of numeration be devised that would allow us to quickly compute PLUS
and TIMES of any numbers, perhaps at the cost of memorizing a finite
set of facts?
	18. Although the inverses of PLUS and TIMES were not singletons,
the COUNT of these sets may follow some regular pattern. 
COUNT * TIMES↑-↑1 is the number of ways to express a number as a product
of two numbers. It will always include the number and 1, as one pair.
The system may decide to name the numbers whose value of 
COUNT * TIMES↑-↑1
is 1: PRIMES. TIMES may also be viewed as operating on a BAG; in this
sense, TIMES↑-↑1 would return all possible bags whose product is the
number. The PRIMES stay the same!  The sys. notices that for 
every number, its image under COUNT * TIMES↑-↑1
contains a bag containing only primes, and only one such bag. 
This is the unique factorization theroem, and the sys. has proposed it
intuitively!
	19. My definition
of BAG implicitly assumes that if any number of α's can be added to the
bag without altering its properties, then a minimal number of them are
actually added. Thus PLUS never operates on zero unless there are less
than two nonzero operands; similarly for TIMES and 1. 
TIMES↑-↑1(0) is quickly
isolated as a very special beast, 
and not included in any statements involving
TIMES↑-↑1. 
	20. Consider applying PLUS * TIMES↑-↑1, that is, add up each
bag of factors.  Perhaps choose a name for those numbers whose
images contain a bag whose sum is < the number, one for those
containing a bag whose sum is = the number. Similarly, consider
COUNT * UNION * TIMES↑-↑1, the total number of factors of a number.
PLUS * UNION * TIMES↑-↑1 is the sum of all the divisors of the number.
This will always range from SUCCESSOR of the number to below
TIMES of the number and itself. Those at the low end are the PRIMES.
Those at PLUS(x,x) = TIMES(x,2) may be named (PERFECTS). 
Those near the high end are
not so well grouped, and may not be named.
	21. Consider comparing TIMES↑-↑1 collections from two numbers.
Clearly UNION * TIMES↑-↑1(x) is a subset of UNION * TIMES↑-↑1 (TIMES(x,y))
for any nonzero y.  When does equality occur? Some cases are when
y is 1, when y is a member of the first set. Are these all? Why?
	22. Compare as above. What is the
INTERSECTION of these two sets? That is, what factors do the two
numbers have in common?   Study this for many cases. Eventually
consider that this intersection set is
UNION * TIMES↑-↑1 of the largest member of the intersection set.
That is, the largest common factor determines all the others!
	23. This is worth a name; call it GCD. The GCD of a prime
and any number is either 1 or the prime, and the second case occurs
iff the prime is a factor of the second number.  In general, two 
numbers may have GCD 1 with neither being prime. Explore this.
	
.NEXT PAGE
⊗24. IDEAS: INTERNAL⊗*

⊗5Ideas about the representation of knowledge⊗*

There seems to be a unifying concept, embarrassingly simple, which
subsumes BEINGS and RULES.  This is merely the idea of pointer
structures; each node of information would be part of several distinct
trees. 

Example: "The associative law over N" is a piece of knowledge. It is
pointed to by ⊗4N⊗* and in turn it points to ⊗4BAG⊗*, in the tree of
⊗4property-of⊗* pointers. It points to the function ⊗4REVERSE⊗* in the
structure dealing with ⊗4execution⊗*.  

The PUP6 BEING community can be viewed as thirty pointer structures,
connecting a hundred knowledge nodes. Each node could be expressed as
a packet of knowledge.  Perhaps a better way to look at it is to conceive
of each part of each BEING as a node. Then the structures for each part
are fairly disjoint. There is one new system of pointers, tying each
set of parts of a BEING together. 

One reason this was not obvious is that most of the links are virtual:
they are computed at run time by pattern-matching, by saying "who should
I link to?"  

Perhaps a better way of looking at the 
knowledge is to keep it unstructured
(unlike BEINGs, which make one pointer structure explicit) physically, and
to have methods (some simple, some sophistocated) for ascertaining the
corpus of information relevant in any given situation. That is, perhaps
one should ⊗4not⊗* fix a set of BEING parts at all!

Predicate calculus typically is organized this way, but has no methods at
all for deciding which knowledge is relevant and which isn't (except to
try and use it!) 

Some rule-based systems have explicit pointer structures; many translation
systems use that idea. 

Further thought should be given to any advantage 
of viewing a non-deterministic
pattern-invoked transfer of attention as if it were following a virtual
pointer in a system relevant to the particular match desired.

⊗5One possible internal organization of knowledge⊗*

Each module of knowledge would be a BEING. The specific set of BEING
parts would differ from those used in PUP6.

Conjectures are formed by analogy, definitions are made on the basis
of aesthetics, and plausibility is determined by empirical testing,
by analogy, and by desirability. Also, the user may direct any of 
these activities.

Each new concept would mean the creation of a new BEING. Part of the
drive, the aestheitics, would be to ⊗4complete⊗*  each BEING.

Taking an idea from CLASSES, we group the BEINGS into communities,
say into OBJECT, OPERATOR, CONCEPT categories.


.FILL
.NEXT PAGE
⊗25. PROTOCOL⊗*

These excerpts don't reflect what the system ⊗4should⊗* do, but merely
are plausible paths it might follow. They give the flavor of the dialogue,
and point out problems which must be attended to.

⊗5First rough attempt: 11/4/74⊗*

S: I AM AWARE OF SEVERAL OPERATORS, WHO THINK THEY MAY BE RELATED.
  I AM GOING TO TRY TO FILL IN THE RELATIONSHIPS BETWEEN THEM.
  COMMENTS/QUESTIONS?

U: I wonder if you could tell me what the operators of interest are?

S: TRANSLATED AS: WHAT OPERATORS?
  ANSWER: APPEND, UNION, INTERSECT, SETDIFFERENCE, COMPLEMENT, COUNT, EQUAL

U: Tell me whether you know what "composition" of operators means.

S: TRANSLATED AS: DEFINED "COMPOSITION"?
  ANSWER: GIVEN TWO FUNCTIONS, IF THE DOMAIN OF THE FIRST INTERSECTS WITH
  THE RANGE OF THE SECOND, IN A SUBSET S, THEN THE COMPOSITION MAY BE
  DEFINED OVER THE PART OF THE DOMAIN OF THE SECOND FUNCTION WHICH IS THE
  PREIMAGE OF S, TO THE PART OF THE RANGE OF THE FIRST FUNCTION WHICH IS
  THE IMAGE OF S, AS FOLLOWS:   TO COMPUTE THE COMPOSITION AT XεS,
  APPLY THE SECOND FUNCTION TO THE VALUE OF THE FIRST FUNCTION APPLIED
  TO X.

U: This is my notation for the composition of f and g (two functions):

   g (f (x)).  f is applied first, and I would like to call it the first function, or the 
   innermost.  Okay?

S: YOUR "FIRST FUNCTION" CORRESPONDS TO MY "SECOND FUNCTION."
  YOUR "SECOND FUNCTION" CORRESPONDS TO MY "FIRST FUNCTION". THIS HAS BEEN
  ASSIMILATED AND CHANGED.  I CANNOT ACCEPT YOUR NOTATION. HOW WOULD
  IT BE POSSIBLE TO REPRESENT G APPLIED TO F APPLIED TO X?

U: As g (f (x)).

S: THEN HOW COULD I DISTINGUISH THIS FROM THE COMPOSITION OF F AND G?

U: Tell me what you think the difference between the composition of f and 
   g and g (f (x)) is.

S: THE FORMER IS AN OPERATOR AND THE LATTER IS AN ELEMENT OF THE RANGE OF G

U: I would rather denote the former by g o f.

S: OK.  WHAT DOES THIS HAVE TO DO WITH THE ORIGINAL QUESTION?
  POSSIBLE ANSWER: TRY TO RELATE THE EXISTING OPERATORS BY COMPOSITION.

U: That is right.  Pay special attention to COUNT as the second function
   (in my sense).

S: COUNT o UNION WAS INVESTIGATED; NOTHING INTERESTING.
  COUNT o APPEND IS BEING INVESTIGATED.  THIS FUNCTION IS ASSOCIATIVE.
  THIS IS COMMUTATIVE.  THE FUNCTION SEEMS INTERESTING, BUT ITS DOMAIN AND RANGE ARE DISTINCT.
  INCT.  SHOULD I WORK TO TRY TO CORRECT THIS, PERHAPS ALSO BY COMPOSITION?

U: What do you mean by the domain and range being distinct? (Perhaps an example?)

S: THE COMPOSITION   COUNToAPPEND MAPS TWO ELEMENTS FROM THE DOMAIN "LISTS"
TO ONE ELEMENT OF THE DOMAIN "NUMBERS".  IF THESE WERE THE SAME, ONE COULD
INVESTIGATE REPEATED APPLICATIONS OF THIS FUNCTION. ALL THE INTERESTING FUNCTIONS
I KNOW ARE OF THIS FORM, FOR EXAMPLE APPEND (TWO LISTS → ONE LIST).

U: Continue to investigate that, and report your results to me.

S:  I CAN CONSTRUCT A FUNCTION (LAMBDA (X,Y) (COUNT (APPEND (COUNT↑-↑1 X)
							    (COUNT↑-↑1 Y))))
  THIS MAPS TWO NUMBERS INTO ONE NUMBER.  I CAN CONSTRUCT
THE ANALOGOUS FUNCTION WITH COUNT↑-↑1 AT THE FRONT, BUT 
THIS IS TOTALLY UNINTERESTING.   
I AAM INVESTIGATING THE FIRST POSSIBILITY.  THIS IS ASSOCIATIVE.

 THIS IS COMMUTATIVE.
  THIS TIES IN WITH SUCCESSOR: SUCCESSOR OF A NUMBER IS THE SAME AS
APPLYING THIS COMPOSITION TO THAT NUMBER AND 1.
  THIS WARRANTS GIVING THE COMPOSITION A NAME. WHAT IS ITS NAME:

U: We will call it ADD.  Can you tell me (i) What you mean by COUNT↑-↑1, and
   (ii) What is the explicit relationship between SUCCESSOR and ADD.

S: COUNT↑-↑1 MAPS A NUMBER ONTO A SET WHICH HAS A COUNT OF THAT NUMBER. THE
  LIST IS IN CANONICAL FORM, A HEAD OF THE FOLLOWING LIST: (X X X X X ...).
  (II). A RETRANSLATION OF THE PRECEDING CONCLUSION:

  ∀XεNUMBERS,  SUCCESSOR(X) = ADD(X,1)

  ALSO:  I NOW WANT TO SEE IF THERE IS ANYTHING ABOUT THE PROPERTIES SPECIAL
  TO SUCCESSOR WHICH WILL TELL ME SOMETHING SPECIAL ABOUT ADD. OK?

U: Yes.  First tell me what properties of SUCCESSOR you are considering.

S: THE SUCCESSOR OF ANY NUMBER CAN BE EXPRESSED AS
  SUCCESSOR(SUCCESSOR(...(SUCCESSOR(0)...)).
  THE SUCCESSOR OF A NUMBER IS > THAT NUMBER.
  EVERY NUMBER HAS A SUCCESSOR.
  EVERY NUMBER EXECPT ZERO IS THE SUCCESSOR OF PRECISELY ONE NUMBER.

   SUCCESSOR IS A FUNCTION.

U: In the first claim, the ellipsis notation doesn't tell me how many applications
   are intended.  Do you know?

S: NO. SHOULD I WORK ON THAT OR ON THE EARLIER IDEA OF APPLYING SOME OF
  THESE FACTS TO ADD?

U: First, consider the ellipsis notation problem.  Then the other one.

S: THERE SEEM EMPIRICALLY TO BE X APPLICATIONS OF
  SUCCESSOR IN THE IDENTITY X = SUCCESSOR(...(SUCCESSOR (0)).
  NOW I AM INVESTIGATING THE SPECIAL KNOWLEDGE.

  THE FIRST PIECE OF KNOWLEDGE TRANSLATES READILY INTO:
THE SUCCESSOR OF ANY NUMBER X CAN BE EXPRESSED AS 
(ADD 1 (ADD 1 (... (ADD 1 0) ...))), WHERE THERE ARE X APPLICATIONS OF ADD.
SINCE THE SUCCESSOR OF ANY NUMBER X IS (ADD X 1), THIS SAYS THAT, FOR ANY X,
 SINCE ADD IS ASSOCIATIVE, (ADD X 1) = (ADD 1 1 ... 1 0) WHERE THERE ARE 
  ADD(X 1) 1'S  BEING ADDED TOGETHER IN THE ELLIPSIS. 
THIS SEEMS TO INDICATE THAT Y IS EQUAL TO Y 1'S ADDED TOGETHER. YES, THIS
  CHECKS OUT.  THIS INCLUDES THE CASE OF 1 = ADD(1 0).  FROM THIS WE
   CONCLUDE THAT 0 IS THE IDENTITY OF ADD.

 ........

.GROUP SKIP 10
⊗5Skipping further ahead, toward discovering REMAINDER.   11/4/74.⊗*

U: Suppose we have two numbers a and b such that a does not divide b.  
   Can you see anything interesting in that relation?  As examples, consider
   the pairs 2 and 7, 3 and 7, 4 and 7, if you like.

S: I GENRALLY PREFER TO CONSIDER HUNDREDS OF EXAMPLES, SO I AM IGNORING
  YOUR LAST SUGGGESTION.  NOW, AS TO THE FORMER...

 IT SEEMS THAT, FOR EACH X, ALL NUMBERS > X DO NOT DIVIDE X. THUS IF
 A DOES NOT DIVIDE B, THERE IS STRONG EVIDENCE THAT A > B.
  
 I FIND A COUNTEREXAMPLE FOR THAT; BUT IT WILL BE KEPT AS A PLAUSIBLE
 INFERENCE.
 THUS, GIVEN UNEQUAL A AND B, PRECISELY ONE OF THE FOLLOWING TWO CASES HOLDS:
	A DIVIDES B MAKES SENSE TO ASK
 	B DIVIDES A MAKES SENSE TO ASK.
  I PROPOSE A NEW DEFINITION, THE CONCEPT (LAMBDA (X Y) (OR
						         (X|Y)(Y|X))
  I AM INVESTIGATING THIS NEW IDEA. WHAT IS ITS NAME?

U:  Let us call it MUTUAL DIVISIBILITY, abbreviated MD.  However, it is more
   interesting to put x > y in the hypothesis.  Can you restate your concept
   with this hypothesis?

S: IF X>Y THEN MD(X,Y) = Y|X.  THAT IS, UNDER THIS HYPOTHESIS, MD= |.

  SO?

U: MD as a concept is probably not interesting.  Can you go on investigating
   the relation DOES NOT DIVIDE?

S: I AM CONSIDERING 810. THE NUMBERS WHICH DONOTDIVIDE IT ARE.......
  HMMM. THIS TIES IN TO THE UNIQUE FACTORIZATION THM:
  THE NONDIVISORS ARE THE PRIME NONDIVISORS AND ANY OF THEIR PRODUCTS.
  THE IDENTITY OF THE NONDIVISORS IS THUS SOLVED.

U: Consider the greatest number c such that c < b and a|c. 

S: SO C = QxA FOR SOME Q. ALSO, B =(ADD C R) FOR SOME R.
SO B = ADD(QxA, R).  THIS MAKES SENSE IF SUCH A C EXISTS. THIS IN TURN
IS TRUE ALWAYS BECAUSE A FINITE SET OF NUMBERS HAS A LARGEST NUMBER.

 LET US SEE IF Q AND R ARE UNIQUE.
  THEY ARE.

  SINCE WE WERE JUSST THINKING OF THE UFT, LET US CONSIDER THE EXPANSIONS
OF EACH OF THESE NUMBERS: SAY
 A = πP↓i↑α↑(↑i↑)
 B = ... W...
 Q = ... V ...
 THEN C = AQ = ...P V.....
 AND R = ... U ...

  THIS DOESN'T GET US ANYWHERE.

  .......

The lessons from these two pieces are not completely expected:
(i) The system must understand a  much larger amount of input syntax than
was anticipated, if the dialogue is to seem natural. 
This poses no ⊗4conceptual⊗* difficulty.
(ii) The system seems like it could very easily get sidetracked into
investigating the consequences of barren definitions. This might be a
serious handicap to advanced discovery; at the lower levels, a teacher
can guide the system out of such bogs.
(iii) The problem of precisely what to print out to the user occurs here.
Too much is bewildering and boring. Too little is dangerous. How exactly
should the user be able to direct the system? How does the level of 
initiative get sensed (when the system should listen, should act, should
direct.)

.NEXT PAGE
⊗26. TASK: INTERNAL LEVEL⊗*


⊗5What modules will be needed for this initial task sequence?⊗*

The BEINGs may be roughly categorized along two axes.
First, as OBJECT/OPERATOR/CONCEPTS, and second as
to the level on which they operate: META/WORKING/BASIC.
.NOFILL


	OBJECTS			OPERATORS		CONCEPTS

META	Human			Create-New-Being	Definition
				Complete-a-Being	Aesthetics
							Efficiency
							Plausibility
							Representation

WORKING	Singleton		Subset			Proof
    	Ordered-Pair         	Ordering	
                       		Induction
				Deduction

BASIC	Atom			∪			Truth
	Bag			∩			Falsity
	Tuple			Append
	Set			Set-difference
				Set-complement


The scheme of development might be:

SUBSET
SET-EQUALITY
EQUIVALENCE RELATION PROPERTIES ARE SATISFIED
PARTITION ALL SETS USING SET-EQUALITY
SUCCESSOR-SET OF A SET
AN ELEMENT OF ANY PARTITION CAN BE CONSTRUCTED BY REPEATED SUCCESSORING
DECIDE ON A STANDARD REPRESENTATIVE FROM EACH PARTITION, USING THIS IDEA
NUMBERS (UNARY REPRESENTATION): COMPOSE LIST OF N "SUCCESSOR"'s and 1 "0".
COUNTING A SET:  put it in its canonical representation
COUNToAPPEND. Note this equals APPENDo(COUNTxCOUNT). Note COUNT↑2 = COUNT.
PLUS. This takes two sets, and does COUNTING and UNIONING (in either order.)
ZERO←→NULL SET, ONE←→SINGLETON, ASSOCIATIVITY AND COMMUTATIVITY OF PLUS.
IDENTITIY OF ZERO:  PLUS(x,ZERO) =  COUNT(x).
RELATION BETWEEN PLUS AND SUCCESSOR. PLUS(x,ONE) = COUNT(SUCCESSOR(x)).
ANY NUMBER CAN BE CONSTRUCTED BY REPEATED PLUSSING OF 1.
REPEATED PLUSSING OF n:  PLUS(n,n,n,...,n)
TIMES of two sets A, B. Let their counts be X and Y. Then
 the value of TIMES is PLUS(X,X,X,...,X), where the COUNT of that bag is Y.
ASSOCIATIVITY, COMMUTATIVITY OF TIMES. 
IDENTITY ELEMENT.  NIL FOR ∪, ⊗4U⊗* FOR ∩, zero FOR PLUS, one FOR TIMES.
DISTRIBUTIVE LAW
EXPONENTIATION.  repeated TIMESing of n with itself.
F(2,2) = 4, WHERE F IS PLUS, TIMES, EXPONENTIATE, HYPEREXPONENTIATE,...
NON-ASSOCIAITIVITY. STOP THIS PARTICULAR EXTENSION SCHEME.
INVERSE OPERATION
PROPERTIES OF PLUS↑-↑1
PROPERTIES OF TIMES↑-↑1, EXPONENTIATE↑-↑1
COMPOSITIONS INVOLVING INVERSES: COUNToTIMES↑-↑1, PLUSoTIMES↑-↑1,
PRIME NUMBERS
UNIQUE FACTORIZATION THEOREM
COMPARE UNIONoTIMES↑-↑1 OF TWO NUMBERS.    ∩ OF THESE TWO SETS.
THIS ∩ IS DETERMINED BY ITS LARGEST MEMBER:  GCD.
RELATIVE PRIMENESS
from here, follow curriculum in standard number theory text.
DIVISIBILITY THEORY (LCM, Division, Remainder, Euclidean Algorithm,
	congruence, TAU, sieving)
NUMERICAL FUNCTIONS (multiplicative fns., perfect nos., MU)
ARITHMETIC OF CONGRUENCE CALSSES (PHI, Residue Systems)
SOLVING CONGRUENCES (Chinese Remainder Thm., Quadratic Residues)
ADVANCED TOPICS (Distribution of primes into congruence classes,
	primitive roots, quadratic reciprocity)

.NEXT PAGE

⊗27. TENTATIVE TIMETABLE⊗*

.NOFILL

Wed-Thu, Dec. 11-12: Work on further simulating the system using the given knowl.
Over vacation: Avra: fill in the three examples in the SYS.3 file
	contemplation, analogy, proof
	       Both: run simulations with human subjects
	check for some gross omissions from the design
		     Also, consider the intuitions of specific concepts
January: Firm up the design, fill out the specific intuitions of each concept
	 Decide on the precise choice algorithms, control mechanisms, use of
			 BEINGs, rules, demons. 
	 Select the classes of BEINGs, and each's BEING parts
 	 Firm up the specific facts present about each concept
		EXACTLY what is usable but opaque, usable and transparent
	 Formal proposal writeup, choose thesis committee
February: Run tests with several subjects, try out system, learn the requirements
		of the language portion, tune up the knowledge, see about a user
		model, consider the size and time involved in an actual program
March: Begin designing the actual program, choosing formats and representations
April: Programming of the system is underway
May: First early results from the system, contemplative phase "debugging"
June: FIrst analogical and proof results, though these stages need more work
	Contemplative phase (and parameters affecting it) is now debugged